Towards Seamless Tracking-Free Web: Improved Detection of Trackers via One-class Learning

Authors

  • Muhammad Ikram
  • Hassan Jameel Asghar
  • Mohamed Ali Kâafar
  • Balachander Krishnamurthy
  • Anirban Mahanti
Abstract

Numerous tools have been developed to aggressively block the execution of popular JavaScript programs in Web browsers. Such blocking also affects functionality of webpages and impairs user experience. As a consequence, many privacy preserving tools that have been developed to limit online tracking, often executed via JavaScript programs, may suffer from poor performance and limited uptake. A mechanism that can isolate JavaScript programs necessary for proper functioning of the website from tracking JavaScript programs would thus be useful. Through the use of a manually labelled dataset composed of 2,612 JavaScript programs, we show how current privacy preserving tools are ineffective in finding the right balance between blocking tracking JavaScript programs and allowing functional JavaScript code. To the best of our knowledge, this is the first study to assess the performance of current web privacy preserving tools. To improve this balance, we examine the two classes of JavaScript programs and hypothesize that tracking JavaScript programs share structural similarities that can be used to differentiate them from functional JavaScript programs. The rationale of our approach is that web developers often “borrow” and customize existing pieces of code in order to embed tracking (resp. functional) JavaScript programs into their webpages. We then propose one-class machine learning classifiers using syntactic and semantic features extracted from JavaScript programs. When trained only on samples of tracking JavaScript programs, our classifiers achieve an accuracy of 99%, where the best of the privacy preserving tools achieved an accuracy of 78%. The performance of our classifiers is comparable to that of traditional two-class SVM. One-class classification, where a training set of only tracking JavaScript programs is used for learning, has the advantage that it requires fewer labelled examples that can be obtained via manual inspection of public lists of well-known trackers. 
We further test our classifiers and several popular privacy preserving tools on a larger corpus of 4,084 websites with 135,656 JavaScript programs. The output of our best classifier on this data differs from that of the tools under study by between 20% and 64%. We manually analyse a sample of the JavaScript programs for which our classifier is in disagreement with all other privacy preserving tools, and show that our approach is not only able to enhance user web experience by correctly classifying more functional JavaScript programs, but also discovers previously unknown tracking services.
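
The idea of one-class learning mentioned in the abstract, training only on tracking JavaScript programs and flagging new programs that look similar, can be illustrated with a small sketch. This is not the authors' classifier: the abstract does not specify their features or model, so the sketch swaps in a deliberately simple stand-in (token bigram counts compared against a centroid by cosine similarity). The names `tokenize`, `ngram_vector`, `CentroidOneClass`, and the sample snippets are all hypothetical and exist only for illustration.

```python
import math
import re
from collections import Counter


def tokenize(js_source):
    # Crude lexical split: identifiers, integers, or single punctuation marks.
    # An assumption for illustration, not a real JavaScript lexer.
    return re.findall(r"[A-Za-z_$][\w$]*|\d+|[^\s\w]", js_source)


def ngram_vector(js_source, n=2):
    # Bag of token n-grams, used here as a stand-in for syntactic features.
    tokens = tokenize(js_source)
    grams = zip(*(tokens[i:] for i in range(n)))
    return Counter(" ".join(g) for g in grams)


def cosine(a, b):
    # Cosine similarity between two sparse count vectors.
    dot = sum(v * b[k] for k, v in a.items() if k in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class CentroidOneClass:
    """Toy one-class model: fitted on tracking samples only."""

    def fit(self, tracking_sources):
        vectors = [ngram_vector(s) for s in tracking_sources]
        # The centroid is the summed bigram counts of all training samples.
        self.centroid = Counter()
        for v in vectors:
            self.centroid.update(v)
        # Accept anything at least as similar as the least typical sample.
        self.threshold = min(cosine(v, self.centroid) for v in vectors)
        return self

    def is_tracker(self, js_source):
        score = cosine(ngram_vector(js_source), self.centroid)
        return score >= self.threshold


# Hypothetical training data: only tracking-style snippets, no negatives.
TRACKERS = [
    "var img = new Image(); img.src = 'https://t.example/p?uid=' + document.cookie;",
    "var px = new Image(); px.src = 'https://a.example/c?id=' + document.cookie;",
    "var b = new Image(); b.src = 'https://s.example/b?u=' + document.cookie + navigator.userAgent;",
]
FUNCTIONAL = "function toggleMenu(el) { el.classList.toggle('open'); return false; }"

model = CentroidOneClass().fit(TRACKERS)
tracker_flags = [model.is_tracker(s) for s in TRACKERS]
functional_flag = model.is_tracker(FUNCTIONAL)
```

The point of the design is that no functional (negative) examples are needed at training time; the decision boundary is drawn purely from how tightly the tracking samples cluster. A production classifier would need far richer features and a properly tuned model.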


Similar resources

Online multiple people tracking-by-detection in crowded scenes

Multiple people detection and tracking is a challenging task in real-world crowded scenes. In this paper, we have presented an online multiple people tracking-by-detection approach with a single camera. We have detected objects with deformable part models and a visual background extractor. In the tracking phase we have used a combination of support vector machine (SVM) person-specific classifie...


Detecting and Defending Against Third-Party Tracking on the Web

While third-party tracking on the web has garnered much attention, its workings remain poorly understood. Our goal is to dissect how mainstream web tracking occurs in the wild. We develop a client-side method for detecting and classifying five kinds of third-party trackers based on how they manipulate browser state. We run our detection system while browsing the web and observe a rich ecosystem...


Like a Pack of Wolves: Community Structure of Web Trackers

Web trackers are services that monitor user behavior on the web. The information they collect is ostensibly used for customization and targeted advertising. Due to rising privacy concerns, users have started to install browser plugins that prevent tracking of their web usage. Such plugins tend to address tracking activity by means of crowdsourced filters. While these tools have been relatively ...


MBest Struct: M-Best diverse sampling for structured tracker

We approach the problem of model-free visual tracking of objects in videos. Model-free tracking has its state-of-the-art in a class of methods called tracking-by-detection, as shown in recent benchmarks. Some top-performing methods use deep neural networks (i.e., convnets) to solve the learning-based steps of the tracking algorithm (e.g., bounding-box prediction and evaluation). Despite improvin...




Journal:
  • PoPETs

Volume 2017  Issue 

Pages  -

Publication date: 2017